
Almost a week ago, I attempted to create my own version of Iron Man’s J.A.R.V.I.S. – a personal assistant that can perform tasks and hold chatbot-like conversations. After some research, I found AIML and its Python interpreter. I also discovered ways to convert text to speech and speech to text using Python libraries like pyttsx and SpeechRecognition. Before long, I had a J.A.R.V.I.S. that understood some of what I said and performed a few fun actions.
Configuring voice input and output:
Text-to-Speech:
With pyttsx installed (through pip install pyttsx or otherwise), the following function speaks a given string:
import pyttsx

def speak(jarvis_speech):
    # Initialise the speech engine, queue the text, and block until it has been spoken.
    engine = pyttsx.init()
    engine.say(jarvis_speech)
    engine.runAndWait()

speak("Hello world")
This should produce an audio output of “Hello world” in a slightly mechanical voice.
Speech-to-Text:
With SpeechRecognition installed (through pip install SpeechRecognition or otherwise), the following function listens for one utterance and returns it as text:
import speech_recognition as sr

def listen():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Talk to J.A.R.V.I.S: ")
        audio = r.listen(source)
    try:
        # Send the audio to Google once and reuse the result.
        text = r.recognize_google(audio)
        print(text)
        return text
    except sr.UnknownValueError:
        # speak() is the text-to-speech helper defined above.
        speak("I couldn't understand what you said! Would you like to repeat?")
        return listen()
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
        return None

listen()
This should listen for a single audio input from the microphone. However, it can produce a series of warnings and errors. Resolutions for some of these errors can be found in the Troubleshooting section of this page.
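The retry in listen() calls itself every time the speech is not understood, so a noisy room could keep it recursing indefinitely. A minimal sketch of a bounded alternative follows, written for Python 3. The names listen_with_retries, recognize_once, and max_attempts are made up for illustration; recognize_once stands in for one microphone capture plus one recognition request, and it is assumed to raise LookupError where the real library would raise sr.UnknownValueError.

```python
def listen_with_retries(recognize_once, max_attempts=3):
    """Call recognize_once() until it returns text or the attempts run out.

    recognize_once is a hypothetical stand-in for one microphone capture
    plus one recognition request; it is assumed to raise LookupError when
    the speech could not be understood.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return recognize_once()
        except LookupError:
            print("Attempt {0} of {1} was not understood.".format(attempt, max_attempts))
    return None  # the caller decides what to do when nothing was recognised


# Usage with a fake recogniser that fails twice, then succeeds.
_outcomes = iter([LookupError(), LookupError(), "what time is it"])

def fake_recognize_once():
    outcome = next(_outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome

result = listen_with_retries(fake_recognize_once)
print(result)
```

With the real library, recognize_once would wrap the r.listen and r.recognize_google calls and translate sr.UnknownValueError into the exception the loop expects.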
Working with AIML:
AIML stands for Artificial Intelligence Markup Language. It’s an XML-based markup language designed for creating conversational AI applications. Bots written in AIML are easy to program and easy to maintain. The catch is that AIML development has become rather stagnant: AIML 1.0 (the version with a Python interpreter) has documentation dating back to 2005, making it fairly obsolete. Nevertheless, AIML is easy to learn and genuinely fun to tinker with. At its core, AIML matches the input against patterns and gives the responses it was programmed with. Each rule is called a category, and it pairs a pattern (the input to match) with a template (the response). A basic AIML snippet looks like this:
<aiml version="1.0.1" encoding="UTF-8">
    <category>
        <pattern>HELLO</pattern>
        <template>
            Well, hello!
        </template>
    </category>
</aiml>
This small piece of code allows J.A.R.V.I.S. to reply with “Well, hello!” for every “HELLO” it gets as input.
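To make the pattern/template pairing concrete, here is a rough sketch (Python 3, standard library only) that reads the snippet above and answers from a plain dictionary. It is only an illustration of the idea, not how the aiml library works internally; load_categories and respond are hypothetical helpers, and returning an empty string for unknown input is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

AIML_SOURCE = """
<aiml version="1.0.1" encoding="UTF-8">
  <category>
    <pattern>HELLO</pattern>
    <template>
      Well, hello!
    </template>
  </category>
</aiml>
"""

def load_categories(aiml_text):
    """Return a dict mapping each pattern to its plain-text template."""
    root = ET.fromstring(aiml_text.strip())
    categories = {}
    for category in root.findall("category"):
        pattern = category.findtext("pattern").strip()
        template = category.findtext("template").strip()
        categories[pattern] = template
    return categories

def respond(categories, user_input):
    # Patterns are written in upper case, so normalise the input first.
    key = user_input.strip().upper()
    # Assumption of this sketch: unknown input yields an empty reply.
    return categories.get(key, "")

categories = load_categories(AIML_SOURCE)
print(respond(categories, "hello"))
```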
This can be customised to give a random output from a list of templates, redirect to an already defined pattern, use part or all of the user’s input, run shell scripts, and more. Out of all these, running shell scripts is what makes AIML – or at least J.A.R.V.I.S. – genuinely powerful.
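Two of these features, capturing part of the user’s input and picking a random reply, can be imitated in a few lines of Python 3. This is a toy sketch under simplifying assumptions: pattern_to_regex and match are invented names, only the * wildcard is handled, and the captured text is upper-cased along with the rest of the input.

```python
import random
import re

def pattern_to_regex(pattern):
    """Turn an AIML-style pattern such as 'PLAY SONG *' into a regex."""
    parts = [re.escape(word) if word != "*" else "(.+)" for word in pattern.split()]
    return re.compile("^" + r"\s+".join(parts) + "$")

def match(pattern, user_input):
    """Return the text captured by '*' ('' if the pattern has none), or None on no match."""
    found = pattern_to_regex(pattern).match(user_input.strip().upper())
    if found is None:
        return None
    return found.group(1) if found.groups() else ""

replies = ["Sure thing!", "Right away, sir!", "On it!"]
star = match("PLAY SONG *", "play song This Fire Burns")
if star is not None:
    # The reply is chosen at random, so the first part of the output varies.
    print(random.choice(replies), "Searching for:", star)
```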
To create a chatbot using AIML, all one has to do is write a set of .aiml files containing categories like the one defined above, then use the aiml library for Python to load them into a kernel through a startup.xml file, which looks like this:
<aiml version="1.0.1" encoding="UTF-8">
    <category>
        <pattern>LOAD AIML</pattern>
        <template>
            <learn>*.aiml</learn>
        </template>
    </category>
</aiml>
This should load all the AIML files present in the working directory into the created kernel. A simple Python script to do this is:
import aiml

kernel = aiml.Kernel()
kernel.learn("startup.xml")   # load the bootstrap file
kernel.respond("load aiml")   # triggers <learn>*.aiml</learn>

while True:
    print(kernel.respond(raw_input("Talk to J.A.R.V.I.S: ")))
A good guide to getting started with AIML using the Python interpreter can be found here.
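The loop above works on typed text. One possible way to wire the voice functions into it is sketched below in Python 3. The run_assistant helper and its exit_phrase parameter are assumptions made for this example, and the stand-in functions exist only so the sketch runs without a microphone or AIML files.

```python
def run_assistant(respond, listen, speak, exit_phrase="goodbye"):
    """Repeatedly listen, ask the bot for a reply, and speak it aloud."""
    while True:
        heard = listen()
        if not heard:                # nothing recognised, so try again
            continue
        if heard.strip().lower() == exit_phrase:
            speak("Goodbye!")
            break
        speak(respond(heard))

# Stand-ins so the sketch runs without a microphone or AIML files.
scripted_input = iter(["hello", None, "goodbye"])
spoken = []
run_assistant(
    respond=lambda text: "Well, hello!" if text.upper() == "HELLO" else "",
    listen=lambda: next(scripted_input),
    speak=spoken.append,
)
print(spoken)
```

In the real assistant, the three arguments would be kernel.respond, listen, and speak from the earlier sections.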
Using the <system> tag:
The <system> tag in AIML is very powerful: it lets a template run a shell command and use the command’s output. A very simple example of this is:
<category>
    <pattern>WHAT TIME IS IT</pattern>
    <template>
        The time is <system>date "+%l:%M %P"</system>
    </template>
</category>
Running date "+%l:%M %P" in a shell returns the current time in 12-hour format, like “6:12 pm”. This can be used as a template output for the question “What time is it?”.
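In Python terms, a <system> tag amounts to roughly the following: run the command in a shell, capture what it prints, and splice that text into the reply. This is a Python 3 sketch of the idea rather than the aiml library’s own implementation, and run_system is a hypothetical helper name.

```python
import subprocess

def run_system(command):
    """Run a shell command and return its output with surrounding whitespace removed."""
    completed = subprocess.run(
        command, shell=True, stdout=subprocess.PIPE, universal_newlines=True
    )
    return completed.stdout.strip()

# Mirrors the template "The time is <system>date ...</system>".
print("The time is " + run_system('date "+%l:%M %P"'))
```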
Using the <system> tag, one can do a lot: open applications, kill running processes, change the volume, adjust the brightness, change the wallpaper, and so on. J.A.R.V.I.S. does quite a few of these.
Integrating Python scripts
Since it’s easy enough to use shell commands through AIML, it’s equally easy to run Python scripts and use their output. The best feature of J.A.R.V.I.S. – and the one I’m most proud of – is the following.
Finding and playing a song on YouTube
With the following Python script, which finds the link to the first result for a given query:
import sys
import urllib
import urllib2

from bs4 import BeautifulSoup

# URL-encode the query passed on the command line.
query = urllib.quote_plus(sys.argv[1].strip('"'))
url = "https://www.youtube.com/results?search_query=" + query
html = urllib2.urlopen(url).read()
soup = BeautifulSoup(html, "lxml")

# Default to the home page if no video link is found.
link = "https://www.youtube.com"
for vid in soup.findAll(attrs={'class': 'yt-uix-tile-link'}):
    if vid['href'].startswith("/watch?v="):
        link = "https://www.youtube.com" + vid['href']
        break  # keep only the first result
print(link)
and this AIML file, which opens the link provided by the above script in chromium-browser:
<category>
    <pattern>PLAY SONG *</pattern>
    <template>
        <random>
            <li>Sure thing! </li>
            <li>Right away, sir! </li>
            <li>On it! </li>
        </random>
        <system> chromium-browser "<system> python youtube.py "<star/>"</system>"</system>
    </template>
</category>
the following output can be achieved:
Talk to J.A.R.V.I.S : play song Killswitch Engage This Fire Burns
J.A.R.V.I.S: Sure thing! Created new window in existing browser session.
The song then opens in a new chromium-browser tab.
I have improved the above AIML code to produce this output:
Talk to J.A.R.V.I.S : play me a song
J.A.R.V.I.S : What song, sir?
Talk to J.A.R.V.I.S : Killswitch Engage This Fire Burns
J.A.R.V.I.S : On it! Created new window in existing browser session.
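The two steps of the search script, building the results URL and picking out the first /watch link, can also be sketched with only the Python 3 standard library. Everything here is illustrative: build_search_url, FirstWatchLink, and first_video_link are invented names, and the sample HTML is made up to stand in for a downloaded results page, so no network access is needed.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus

def build_search_url(query):
    """URL-encode the query so spaces and symbols survive the request."""
    return "https://www.youtube.com/results?search_query=" + quote_plus(query)

class FirstWatchLink(HTMLParser):
    """Remember the first <a href="/watch?v=..."> seen in the HTML."""
    def __init__(self):
        super().__init__()
        self.link = None

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if self.link is None and tag == "a" and href.startswith("/watch?v="):
            self.link = "https://www.youtube.com" + href

def first_video_link(html):
    parser = FirstWatchLink()
    parser.feed(html)
    # Fall back to the home page when no video link is present.
    return parser.link or "https://www.youtube.com"

# Made-up HTML standing in for a downloaded results page.
sample = '<a href="/channel/abc">Channel</a><a href="/watch?v=VIDEO_ID">Song</a>'
print(build_search_url("Killswitch Engage This Fire Burns"))
print(first_video_link(sample))
```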
The repository for J.A.R.V.I.S. lies here. Feel free to fork it and customise it. Create a PR if you feel you have an interesting feature that could be added.
Happy coding!